Single classification tree.
The Gini Index is referred to as a measure of node purity (James et al. 2021). It can also be used to measure the importance of each predictor. The Gini Index is defined by the following formula, where K is the number of classes and \({\hat{p}_{mk}}\) is the proportion of observations in the mth region that are from the kth class. A Gini Index of 0 represents perfect purity: every observation in the region belongs to a single class.
\[G=\sum_{k=1}^{K} \hat{p}_{mk}(1-\hat{p}_{mk})\]
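To make the definition concrete, the following minimal sketch computes the Gini Index for a single node from its class labels. The function name `gini_index` and the example labels are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter


def gini_index(labels):
    """Gini Index of one node: sum over classes of p_k * (1 - p_k)."""
    n = len(labels)
    if n == 0:
        raise ValueError("node has no observations")
    # Proportion of observations in the node belonging to each class.
    proportions = [count / n for count in Counter(labels).values()]
    return sum(p * (1 - p) for p in proportions)


# Hypothetical nodes: one containing a single class, one split evenly.
pure_node = ["yes"] * 6
mixed_node = ["yes"] * 3 + ["no"] * 3

print(gini_index(pure_node))   # 0.0 (perfect purity)
print(gini_index(mixed_node))  # 0.5 (maximum impurity for two classes)
```

A node dominated by one class gives a value near 0, while an evenly mixed two-class node gives 0.5.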
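Bagging (bootstrap aggregation) combines the predictions of many decision trees, each fit to a different bootstrapped training set. It is defined by the following formula, where B is the number of bootstrapped training sets and \(\hat{f}^{*b}\) is the prediction model fit to the bth bootstrapped training set. For classification trees, the average is replaced by a majority vote across the B trees. Although bagging improves prediction accuracy, it makes the results harder to interpret, since the ensemble cannot be visualized as easily as a single decision tree (James et al. 2021).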
\[\hat{f}_{\mathrm{bag}}(x) = \frac{1}{B} \sum_{b=1}^{B}\hat{f}^{*b}(x)\]
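The averaging step can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the procedure used in this analysis: the base learner `fit_nearest_neighbour` is a hypothetical stand-in for a decision tree, and `bagged_predict`, the toy data, and the seed are likewise invented for the example.

```python
import random


def fit_nearest_neighbour(sample):
    """Hypothetical base learner (stand-in for a tree): predicts the y of
    the training point whose x is closest to the query x."""
    return lambda x: min(sample, key=lambda pair: abs(pair[0] - x))[1]


def bagged_predict(data, x, B=100, seed=0):
    """Average the predictions of B models, each fit to a bootstrap sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(B):
        # Bootstrap: draw len(data) observations with replacement.
        bootstrap = [rng.choice(data) for _ in data]
        model = fit_nearest_neighbour(bootstrap)
        total += model(x)
    return total / B


# Toy (x, y) pairs, assumed for illustration only.
toy_data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
prediction = bagged_predict(toy_data, x=2.5, B=50)
print(prediction)
```

Because each bootstrap sample omits some observations, the individual models disagree, and averaging them reduces the variance of the final prediction.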
Default
Tuned